---
title: "Cartesia"
description: "Tools for interacting with Cartesia Voice AI services including text-to-speech and voice localization"
---

**CartesiaTools** enable an Agent to perform text-to-speech, list available voices, and localize voices using [Cartesia](https://docs.cartesia.ai/).

## Prerequisites

The following examples require the `cartesia` library and a Cartesia API key.

```bash
pip install cartesia
```

```bash
export CARTESIA_API_KEY="your_api_key_here"
```
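A missing key is easier to diagnose before the agent runs than in the middle of a tool call. The following is a minimal sketch of such a check. The helper name `require_cartesia_key` is hypothetical and is not part of `agno` or `cartesia`; only the `CARTESIA_API_KEY` variable name comes from this page.

```python
import os
from typing import Mapping


def require_cartesia_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Cartesia API key from the environment or fail with a clear message.

    Hypothetical helper for illustration; not part of agno or cartesia.
    """
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; export it before running the examples."
        )
    return key
```

Calling `require_cartesia_key()` at the top of a script surfaces the problem immediately, with a message that names the missing variable.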

## Example

```python
from agno.agent import Agent
from agno.tools.cartesia import CartesiaTools
from agno.utils.audio import write_audio_to_file

# Initialize Agent with Cartesia tools
agent = Agent(
    name="Cartesia TTS Agent",
    description="An agent that uses Cartesia for text-to-speech.",
    tools=[CartesiaTools()],
)

response = agent.run(
    """Generate a simple greeting using Text-to-Speech:

    Say "Welcome to Cartesia, the advanced speech synthesis platform. This speech is generated by an agent."
    """
)

# Save the generated audio
if response.audio:
    write_audio_to_file(audio=response.audio[0].content, filename="tmp/greeting.mp3")
```

## Advanced Example: Translation and Voice Localization

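This example translates text, analyzes its emotion, creates a localized voice, and generates a voice note using CartesiaTools. Because `localize_voice` is disabled by default, the toolkit is created with `enable_localize_voice=True`. The example uses `OpenAIChat`, so it also needs the `openai` package and an `OPENAI_API_KEY`.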

```python
from textwrap import dedent
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.cartesia import CartesiaTools
from agno.utils.audio import write_audio_to_file

agent_instructions = dedent(
    """Follow these steps SEQUENTIALLY to translate text and generate a localized voice note:
    1. Identify the text to translate and the target language from the user request.
    2. Translate the text accurately to the target language.
    3. Analyze the emotion conveyed by the translated text.
    4. Call `list_voices` to retrieve available voices.
    5. Select a base voice matching the language and emotion.
    6. Call `localize_voice` to create a new localized voice.
    7. Call `text_to_speech` to generate the final audio.
    """
)

agent = Agent(
    name="Emotion-Aware Translator Agent",
    description="Translates text, analyzes emotion, selects a suitable voice, creates a localized voice, and generates a voice note (audio file) using Cartesia TTS tools.",
    instructions=agent_instructions,
    model=OpenAIChat(id="gpt-5-mini"),
    # localize_voice is off by default, so enable it explicitly
    tools=[CartesiaTools(enable_localize_voice=True)],
)

agent.print_response(
    "Translate 'Hello! How are you? Tell me more about the weather in Paris?' to French and create a voice note."
)
response = agent.run_response

if response.audio:
    write_audio_to_file(
        response.audio[0].base64_audio,
        filename="french_weather.mp3",
    )
```
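To make the final save step concrete, here is a standard-library sketch of writing base64-encoded audio to disk. It assumes, as the attribute name suggests, that `base64_audio` holds a base64-encoded string. The helper `save_base64_audio` is hypothetical and is a stand-in for illustration, not the implementation of `write_audio_to_file`.

```python
import base64
from pathlib import Path


def save_base64_audio(b64_audio: str, filename: str) -> Path:
    """Decode base64 audio and write the raw bytes to `filename`.

    Hypothetical helper; assumes the input is a base64-encoded string.
    """
    path = Path(filename)
    # Create the parent directory (for example "tmp/") if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(b64_audio))
    return path
```

Under that assumption, `save_base64_audio(response.audio[0].base64_audio, "tmp/french_weather.mp3")` would write the file and return its path.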

## Toolkit Params

| Parameter                | Type             | Default                                    | Description                                                                                       |
|--------------------------|------------------|--------------------------------------------|---------------------------------------------------------------------------------------------------|
| `api_key`                | `str`            | `None`                                     | The Cartesia API key for authentication. If not provided, uses the `CARTESIA_API_KEY` env variable. |
| `model_id`               | `str`            | `sonic-2`                                  | The model ID to use for text-to-speech.                                                         |
| `default_voice_id`       | `str`            | `78ab82d5-25be-4f7d-82b3-7ad64e5b85b2`      | The default voice ID to use for text-to-speech and localization.                                 |
| `enable_text_to_speech` | `bool`           | `True`                                     | Enable text-to-speech functionality.                                                             |
| `enable_list_voices`    | `bool`           | `True`                                     | Enable listing available voices functionality.                                                   |
| `enable_localize_voice` | `bool`           | `False`                                    | Enable voice localization functionality.                                                         |

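The parameters above can be collected in one place and overridden per use case. The sketch below copies the defaults from the table and merges overrides into them. The names `TABLE_DEFAULTS`, `cartesia_kwargs`, and `build_tools` are hypothetical helpers for illustration; only the parameter names and defaults come from this page.

```python
from typing import Any, Dict

# Defaults copied from the parameter table above.
TABLE_DEFAULTS: Dict[str, Any] = {
    "api_key": None,  # None means: fall back to CARTESIA_API_KEY
    "model_id": "sonic-2",
    "default_voice_id": "78ab82d5-25be-4f7d-82b3-7ad64e5b85b2",
    "enable_text_to_speech": True,
    "enable_list_voices": True,
    "enable_localize_voice": False,
}


def cartesia_kwargs(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = set(overrides) - set(TABLE_DEFAULTS)
    if unknown:
        raise TypeError(f"Unknown CartesiaTools parameter(s): {sorted(unknown)}")
    return {**TABLE_DEFAULTS, **overrides}


def build_tools(**overrides: Any):
    """Construct CartesiaTools from the merged keyword arguments."""
    # Imported lazily so this sketch loads even where agno is not installed.
    from agno.tools.cartesia import CartesiaTools

    return CartesiaTools(**cartesia_kwargs(**overrides))
```

For example, `build_tools(enable_localize_voice=True)` would create a toolkit with all three functions enabled, and a misspelled parameter name fails early with a `TypeError`.
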
## Toolkit Functions

| Function         | Description                                                                                   |
|------------------|-----------------------------------------------------------------------------------------------|
| `list_voices`    | Lists the voices available from Cartesia.                                                     |
| `text_to_speech` | Converts text to speech.                                                                      |
| `localize_voice` | Creates a new localized voice. Disabled by default; enable with `enable_localize_voice=True`. |

## Developer Resources

- View [Tools](https://github.com/agno-agi/agno/blob/main/libs/agno/agno/tools/cartesia.py)
- View [Cookbook](https://github.com/agno-agi/agno/blob/main/cookbook/tools/cartesia_tools.py)
